Streaming Dictionary Matching with Mismatches
Authors
Abstract
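In the k-mismatch problem we are given a pattern of length n and a text and must find all locations where the Hamming distance between the pattern and the text is at most k. A series of recent breakthroughs have resulted in an ultra-efficient streaming algorithm for this problem that requires only $$\mathcal {O}(k \log \frac{n}{k})$$ space and $$\mathcal {O}(\log \frac{n}{k} (\sqrt{k \log k} + \log ^3 n))$$ time per letter (Clifford, Kociumaka, Porat, SODA 2019). In this work, we consider a strictly harder problem called dictionary matching with k mismatches. In this problem, we are given a dictionary of d patterns, where the length of each pattern is at most n, and must find all substrings of the text that are within Hamming distance k from one of the patterns. We develop a streaming algorithm for this problem with $$\mathcal {O}(k d \log ^k d \mathop {\mathrm {polylog}}{\,n})$$ space and $$\mathcal {O}(k \log ^{k} d \mathop {\mathrm {polylog}}{\,n} + |\mathrm {output}|)$$ time per position of the text. The algorithm is randomised and outputs correct answers with high probability. On the lower bound side, we show that any streaming algorithm for dictionary matching with k mismatches requires $$\varOmega (k d)$$ bits of space.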
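To make the problem statement concrete, here is a minimal Python sketch of dictionary matching with k mismatches. It is a naive offline baseline that stores the whole text and compares every pattern at every position, so it is not the streaming algorithm of the paper and comes nowhere near its space bound. The function names `hamming` and `dictionary_k_mismatch`, and the convention of reporting matches by their end position, are assumptions made for this illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def dictionary_k_mismatch(text: str, patterns: list[str], k: int) -> list[tuple[int, int]]:
    """Naive baseline (not the paper's algorithm).

    Returns (end, pattern_index) pairs such that the substring of `text`
    ending just before index `end` is within Hamming distance k of
    patterns[pattern_index]. Positions are processed left to right,
    mirroring the order in which a stream would deliver the text.
    """
    hits = []
    for end in range(1, len(text) + 1):
        for idx, pat in enumerate(patterns):
            start = end - len(pat)
            # Skip patterns that are longer than the prefix read so far.
            if start >= 0 and hamming(text[start:end], pat) <= k:
                hits.append((end, idx))
    return hits


if __name__ == "__main__":
    print(dictionary_k_mismatch("abcabd", ["abc", "bd"], 1))
```

With the text `abcabd`, the dictionary `["abc", "bd"]` and k = 1, the sketch reports matches ending at positions 3 and 6 for both patterns. Its running time is proportional to the text length times the total pattern length, and it keeps the entire text in memory, which is exactly what a streaming algorithm has to avoid.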
Similar resources
A Practical Index for Approximate Dictionary Matching with Few Mismatches
Approximate dictionary matching is a classic string matching problem, with applications in, e.g., online catalogs, geolocation (mapping possibly misspelled location description to geocoordinates), web searchers, etc. We present a surprisingly simple solution, based on the Dirichlet principle, for matching a keyword with few mismatches and experimentally show that it offers competitive space-tim...
Streaming Periodicity with Mismatches
We study the problem of finding all k-periods of a length-n string S, presented as a data stream. S is said to have k-period p if its prefix of length n − p differs from its suffix of length n − p in at most k locations. We give a one-pass streaming algorithm that computes the k-periods of a string S using poly(k, log n) bits of space, for k-periods of length at most n/2. We also present a two-pa...
Parameterized matching with mismatches
The problem of approximate parameterized string searching consists of finding, for a given text t = t1t2 . . . tn and pattern p = p1p2 . . . pm over respective alphabets Σt and Σp , the injection πi from Σp to Σt maximizing the number of matches between πi(p) and ti ti+1 . . . ti+m−1 (i = 1,2, . . . , n −m + 1). We examine the special case where both strings are run-length encoded, and further ...
Fast String Matching with Mismatches
We describe and analyze three simple and fast algorithms on the average for solving the problem of string matching with a bounded number of mismatches. These are the naive algorithm, an algorithm based on the Boyer-Moore approach, and ad-hoc deterministic finite automata searching. We include simulation results that compare these algorithms to previous works.
On String Matching with Mismatches
In this paper, we consider several variants of the pattern matching with mismatches problem. In particular, given a text T = t1t2 · · · tn and a pattern P = p1p2 · · · pm, we investigate the following problems: (1) pattern matching with mismatches: for every i, 1 ≤ i ≤ n − m + 1, output the distance between P and titi+1 · · · ti+m−1; and (2) pattern matching with k mismatches: output those posit...
Journal
Journal title: Algorithmica
Year: 2021
ISSN: ['1432-0541', '0178-4617']
DOI: https://doi.org/10.1007/s00453-021-00876-x